Computing with unreliable resources: design, analysis and algorithms
Abstract
This thesis is devoted to the study of computing with unreliable resources, a paradigm emerging in a variety of technologies, such as circuit design, cloud computing, and crowdsourcing. In circuit design, as we approach physical limits, semiconductor fabrication has become increasingly susceptible to flaws, resulting in unreliable circuit components. In cloud computing, due to co-hosting, virtualization, and other factors, the response times of computing nodes are variable. This calls for computation frameworks that take this unreliable quality of service into account. In crowdsourcing, we humans are the unreliable computing processors due to our inherent cognitive limitations. We investigate these three topics in the three parts of this thesis. We demonstrate that it is often necessary to introduce redundancy to achieve reliable computing, and that this needs to be carried out judiciously to attain an appealing balance between reliability and resource usage. In particular, it is crucial to take the statistical properties of unreliability into account during system design, rather than to handle it as an afterthought.

In the first part, we investigate circuit design with unreliable circuit components. We first analyze the design of Flash Analog-to-Digital Converters (ADCs) with imprecise comparators. Formulating this as a problem of scalar quantization with noisy partition points, we analyze fundamental limits on ADC accuracy and obtain designs that increase the yield of ADCs (e.g., 5% to 10% for 6-bit Flash ADCs). Our results show that, given a fixed amount of silicon area, building a larger number of smaller, less precise comparators leads to better ADC accuracy. We then address the problem of digital circuit design with faulty components. To achieve reliability, we introduce redundant elements that can replace faulty elements via a configurable interconnect. We show that the required number of redundant elements depends on the amount of interconnect available, and propose designs that achieve a near-optimal trade-off between redundancy and interconnect overhead in several design settings.

The second part of this thesis explores the problem of executing a collection of tasks in parallel on a group of computing nodes. This setting is often seen in cloud computing and crowdsourcing, where the response times of computing nodes are random due to their variability. In this case, the overall latency is determined by the response time of the slowest computing node, which is often much larger than the average response time. Task replication, which sends the same task to multiple computing nodes and takes the earliest result, reduces latency but in general incurs additional resource usage. We propose a theoretical framework to analyze the trade-off between latency and resource usage. We show that, while in general there is a tension between latency and resource usage, there exist scenarios where replicating tasks judiciously reduces both latency and resource usage simultaneously. Our investigation offers insight into when and how replication helps, and provides efficient scheduling policies for a variety of computing scenarios.

Lastly, we study the problem of crowd-based ranking via pairwise comparisons, with humans as unreliable comparators. We formulate this as the problem of approximate sorting with noisy comparisons. By developing a rate-distortion theory on permutation spaces, we obtain information-theoretic lower bounds on the query complexity of approximate sorting with both noiseless and noisy comparisons. Our lower bounds show the optimality of certain existing algorithms with respect to noiseless comparisons and provide a benchmark for approximate sorting with noisy comparisons.

Thesis Supervisor: Gregory W. Wornell
Title: Professor of Electrical Engineering and Computer Science
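The task replication trade-off described in the second part can be illustrated with a small Monte Carlo sketch. This is not the thesis's framework; it is a toy model under stated assumptions: response times are i.i.d. exponential with mean 1, all replicas of a task start at the same time, and the remaining replicas are cancelled as soon as the first one responds. The function name `simulate` and its parameters are hypothetical.

```python
import random


def simulate(num_tasks, replicas, trials=2000, seed=0):
    """Estimate mean job latency and mean total machine time.

    Assumptions (illustrative only, not taken from the thesis):
    response times are i.i.d. exponential with mean 1, replicas start
    together, and a task's other replicas are cancelled once its
    first replica responds.
    """
    rng = random.Random(seed)
    total_latency = 0.0
    total_cost = 0.0
    for _ in range(trials):
        # Each task finishes when its fastest replica responds.
        task_times = [
            min(rng.expovariate(1.0) for _ in range(replicas))
            for _ in range(num_tasks)
        ]
        # The whole job finishes when the slowest task finishes.
        total_latency += max(task_times)
        # Every replica of a task runs until that task's first response.
        total_cost += replicas * sum(task_times)
    return total_latency / trials, total_cost / trials


if __name__ == "__main__":
    for r in (1, 2, 3):
        latency, cost = simulate(num_tasks=10, replicas=r)
        print(f"replicas={r}: latency={latency:.2f}, machine time={cost:.2f}")
```

In this toy model the exponential distribution is memoryless, so the expected machine time stays at `num_tasks` for any number of replicas while the expected latency shrinks in proportion to `1 / replicas`. Other response-time distributions or a no-cancellation policy would behave differently, which is the kind of question the latency versus resource-usage analysis addresses.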